Similarity transformation method for data processing and visualization

ABSTRACT

A similarity transform method of providing a parameterized representation of physical or engineering functions for use in retrieving the engineering or physical functions from data, comprising (a) obtaining samples of the functions from data, numerical simulations, or analytic models; (b) extracting generic function shape information from the samples; (c) embedding the function shape information in a parametric discrete grid-based function representation model (forward model); (d) fitting data with the forward model; and (e) retrieving the function from the fitted forward model. The similarity transform method provides a framework for extracting generic function shape information, in the form of a non-dimensional shape function, from data, numerical simulations, or analytic models. Thus, the present invention facilitates analysis of general characteristics of a physical or engineering variable, in terms of the dependence of the variable on other variables or parameters.

FIELD OF THE INVENTION

The present invention relates to data processing and visualization. More particularly, it relates to a similarity transformation method of representing physical or computer generated data or functions on a discrete grid of one or more independent coordinates for use in a variety of computer applications. In this regard, this method extracts generic shape information on functions of one or more variables, and provides the means to manipulate a function describing a physical system while maintaining the generic shape of the function for a variety of computer applications, such as, for example, function fitting, inversion of data, graphical display and data visualization, pattern recognition, and data synthesis.

BACKGROUND AND SUMMARY OF THE INVENTION

Applications that extract generic shape information involve constructing a parametric representation of the data or object of interest, and then manipulating the values of the parameters to cover the range of states that may be realized by the physical or graphical system of interest.

For example, upper atmospheric remote sensing techniques often measure geophysical properties indirectly, requiring that the underlying variable of interest (e.g., species density) be inferred from the data through comparison with a forward model of measurement process. In discrete inverse theory (DIT), the forward model includes a parametric representation of the variable to be retrieved. The data then provides a basis for computing optimal values of the model parameters.

Consider the remote measurement of altitude profiles of upper atmospheric properties (e.g., species densities or temperatures). Measurement techniques include computerized ionospheric tomography and remote sensing of thermospheric and ionospheric composition using ultraviolet limb-scanning or limb-imaging. In the inversion process, one may parameterize the species altitude profile by one of various means, which include: (1) using an analytic function that is perceived to approximate the “true” function; (2) identifying model parameters with species concentration values on a discrete vertical grid; and (3) expanding the profile in a set of basis functions (e.g., splines or empirical orthogonal functions), an expansion which is often truncated to increase computational speed.

In order to manipulate a function governing a physical system, while maintaining generic shape of a function, for achieving function fitting, inversion of data, or pattern recognition, construction of a parametric “forward model” of the measurement process may be needed to compute the optimal values of the parameters by systematic comparison of the forward model values with the measured data. The similarity transformation method of the present invention works well with standard algorithms for computing optimal values.

The task of achieving function fitting, inversion of data, and pattern recognition requires the selected parametric representation to be robust in order to access the range of values that a physical system can occupy. The parametric representation must also be constrained to prevent unrealistic or nonphysical states or values from being accessed through manipulation of the parameters. For example, if one uses an overly robust function to attempt a smoothing of noisy data, the function may “fit the noise” rather than yield the smooth representation desired.

The analytic function approach, as described above, sufficiently constrains the forward model to prevent undue influence by noise, and often requires only a minimal number of model parameters to be evaluated. This approach, however, lacks the robustness to capture faithfully all of the possible states of the system or object of interest.

The second and third approaches, as noted above, identify model parameters with species concentration values on a discrete vertical grid, or with coefficients of an expansion in a set of basis functions, respectively. Both require the evaluation of more model parameters. Further, some form of regularization or a priori information is necessary to ensure smoothness of the retrieved representation in the presence of noise, in order to prevent the models from becoming sufficiently flexible to “fit the noise” or from becoming computationally unstable. Thus, there is a need for a method to overcome the problems identified above.

Accordingly, the present invention proposes a method to overcome the above identified problems. The present invention embeds detailed information on the shape of a physical function in a discrete (grid-based) representation. The present method includes advantages of the analytic function approach without the drawback of having to identify or concoct an analytic representation that is both physically faithful and robust. Detailed shape information may be obtained from past discrete data on the system or function of interest, or from fields of discrete function values derived from detailed simulations or from analytic theory. The similarity transform method of the present invention enables the determination of universality of function shapes in various models or data sets as functions of environmental conditions, location, time, etc. For example, given a species number density profile that is known or assumed to be typical, the similarity transformation method of the present invention produces a parametric function that ranges over the infinite set of profiles having the same generic shape properties (ordering of local extrema, inflection points, etc.). This explicit shape constraint ensures smoothness in fitting noisy data by the parameterized function.

The present method provides a framework for extracting generic profile shape information, in the form of a non-dimensional shape function, from observations, physics-based numerical simulations, or analytic theory. In this way, the present method facilitates analysis of general characteristics of species concentration variations with coordinates and with other indexing parameters. For DIT retrievals of species concentration profiles from atmospheric observations, the similarity transform-based forward model embeds the generic (“basis”) shape information directly into a parametric representation of each species profile. The representation may also be used to cover the extraction of non-dimensional shape functions from discrete data or simulations, the basic forward model representation, and generalizations of the basic approach.

In another embodiment, the method of the present invention may be used to represent multivariate functions, as well as single variable functions. For multivariate functions, the method involves division of the basis shape function into contiguous hyper-subsurfaces by partitioning the basis shape function domain into contiguous subsets. Likewise, the forward model domain is also partitioned and mapped with the basis function subsurfaces for corresponding subsets of the forward model domain.

In one aspect, a method of providing parameterized representation of geophysical functions for use in retrieving the geophysical functions from remote sensing data, comprising: obtaining atmospheric measurements; extracting generic profile shape information from the measurements; embedding the profile shape information in a parametric discrete grid-based profile representation model (forward model); and retrieving species concentration profiles from the forward model. The data is preferably obtained by remote sensing systems. The data may also be obtained by numerical simulations. The profile shape information is preferably extracted at every latitude-longitude grid point for maintaining an approximate universality of species profile shape under specific geophysical conditions. The shape information is extracted using Discrete Inverse Theory (DIT). The forward model provides a parameterized representation of a signal without statistical noise (true signal). The values of the forward model are manipulated to fit said forward model to said true signal. The method of providing parameterized representation, as above, is performed to accomplish at least one of function fitting, inversion of data, graphical display and data visualization, pattern recognition, or data synthesis functions.

In another aspect, a method for extracting generic shape information on functions having one or more variables, comprising: measuring atmospheric data by remote sensing; defining a dimensionless similarity variable and a dimensionless shape function; extracting discrete values of the shape function; performing function manipulation and retrieval from the extracted discrete values; selecting a basis function; defining a forward model N_(f) for a ground truth function, the forward model representing an exact profile of the property of interest that underlies the data; performing a fitting process on the forward model while holding the underlying shape function constant; ensuring that the forward model function is similar in shape to the basis function; iterating the step of performing the fitting process if the forward model function is dissimilar in shape to the basis function; and mapping the basis function profile to the forward model.

In yet another aspect, a similarity transform method of providing parameterized representation of geophysical functions for use in retrieving the geophysical functions from remote sensing data, comprising: obtaining function samples; extracting generic profile shape information from the samples; embedding the profile shape information in a parametric discrete grid-based profile representation model (forward model); fitting the forward model to the samples to obtain fitted forward model; and retrieving species concentration profiles from the fitted forward model to retrieve geophysical functions.

Still other objects and advantages of the present invention will become apparent to those skilled in the art from the following detailed description, wherein only the preferred embodiment of the invention is shown and described, simply by way of illustration of the best mode contemplated of carrying out the invention. As will be realized, the invention is capable of different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present invention may be had by reference to the following Detailed Description when taken in conjunction with the accompanying drawings wherein:

FIG. 1 shows a system for obtaining samples of functions and processing the obtained samples to obtain optimal parameter values in accordance with the present invention;

FIG. 1a illustrates an exemplary profile of the atomic oxygen number density [O](z) as generated from the MSISE-90 empirical model, where g(η) is computed from [O](z), where z is the altitude;

FIG. 1b illustrates a profile of similarity variable, η(z), as a function of altitude;

FIG. 1c illustrates a profile of a shape function, g[η(z)] versus [O](z);

FIG. 1d illustrates a profile of the shape function g[η(z)] versus η;

FIGS. 2a-2c show non-linear least squares fits of similarity transform-based model profiles to respective exact species density profiles, the triangles illustrating the profiles used to initialize the fitting calculations;

FIGS. 2d-2f show the ratios of the fitted profiles to the corresponding exact profiles as shown in FIGS. 2a-2c, respectively;

FIGS. 3a-3c show fits of a state-of-the-art MSIS-based forward model to MSISE-90 altitude profiles of N₂, O₂, and [O], respectively;

FIGS. 3d-3f show the ratios for the corresponding profiles of FIGS. 3a-3c, respectively;

FIGS. 4a-4c show the mapping of one segment of the basis shape function onto the corresponding semiopen subinterval within the overall domain of the forward model;

FIGS. 5a-5c show fits (for L=2) of forward model to MSISE-90 altitude profiles of N₂, O₂, [O], respectively;

FIGS. 5d-5f show the ratios for the corresponding profiles of FIGS. 5a-5c, respectively;

FIG. 6 shows a detailed flow chart for the similarity transform method that provides a parametric representation of geophysical functions for use in retrieving such functions from remote sensing observations;

FIG. 7 shows an overall flow chart for the similarity transform method of FIG. 6;

FIG. 8 shows details of the fitting process used to obtain optimal model parameter values, as shown in FIGS. 6 and 7.

DETAILED DESCRIPTION OF THE INVENTION

In the drawings, like or similar elements are designated with identical reference numerals throughout the drawings, and the various elements depicted are not necessarily drawn to scale.

FIG. 1 shows a system for obtaining samples of functions and processing the obtained samples to obtain optimal parameter values in accordance with the present invention. Here, a satellite system 10 may be used to scan the earth's atmosphere in order to measure and obtain samples of a variety of functions. For example, profiles of atomic oxygen may be obtained as a function of altitude. The measured spectrum may then be transmitted to earth via a wireless communication network. Any known communication protocol may be used to communicate the measured information from the satellite 10 to a ground-based processor system 12. The ground-based processor system 12 may be a computer system having logic to process the information received from the satellite 10, to determine optimal similarity transform parameters, and to retrieve optimal parameter profiles. Model parameter values computed by the processor system 12 may be stored in a database system 14, which may be local or remote to the processor system 12.

Section 1

With respect to FIG. 6, consider a function N(z) describing the variation of an atmospheric property (for example, species density or its logarithm) with altitude z over a domain [z₀, z_(M)]. Assume that a physical system or function has true or noiseless values {N(z_(i)); i=0, 1, . . . , M} on a monotonically increasing discrete grid {z_(i); i=0, 1, . . . , M}, so that z₀<z₁< . . . <z_(M). Define a dimensionless similarity variable as shown in Equation 1: $\begin{matrix} {\eta(z) \equiv \frac{z - z_{0}}{z_{M} - z_{0}}} & (1) \end{matrix}$

Now, define a dimensionless shape function as shown in Equation 2: $\begin{matrix} {g(\eta(z)) \equiv \frac{N(z) - N_{0}}{N_{M} - N_{0}} = \frac{N\left(\left[z_{M} - z_{0}\right]\eta(z) + z_{0}\right) - N_{0}}{N_{M} - N_{0}},} & (2) \end{matrix}$

where, for example, N(z)=log [O](z), “log” denotes the natural logarithm, z varies from the lowest to the highest grid point defining the altitude profile, N₀≡N(z₀), N_(M)≡N(z_(M)), and N₀≠N_(M). Notice that η varies from 0 to 1, linearly with z, and that g(0)=0 and g(1)=1. Equations (1) and (2) then allow us to express the function N(z) in terms of the similarity variable and shape function:

N(z, m)≡N₀+g(η(z))[N_(M)−N₀]  (2.1)

where the model parameter vector m=[z₀, z_(M), N₀, N_(M)] is included to point out the dependence of N, η, and g on the parameters z₀, z_(M), N₀, and N_(M).

From the discrete function, Equations (1) and (2) may be used to extract the discrete values of the shape function g_(i)=g(η_(i))=g(η(z_(i))). FIG. 1(a) shows an example derived from a Mass Spectrometer Incoherent Scatter Extended (MSISE-90) empirical model calculation of the atomic oxygen number density, [O](z), covering several decades. In this example, N(z) is log [O](z), where “log” denotes the natural logarithm. In FIGS. 1(b) and 1(d), both η and g vary from 0 to 1 as “z” varies from the lowest to the highest grid points defining the altitude profile. FIG. 1(c) further reveals a linear relationship between the shape function and the corresponding logarithmic density profile, consistent with Equation (2). The shape function therefore captures the manner in which the atmospheric property varies over its domain, independent of the actual range of physical values or the size of the domain.

The non-dimensionalization in Equations (1) and (2) places all functions describing a given system in the domain [z₀, z_(M)] on a more equal footing and, for example, permits a direct comparison of profiles of a given atmospheric property under different conditions or at different locations. To obtain values of the function or system property at a particular value of z, where z∉{z_(i)} but z₀<z<z_(M), one must interpolate.
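The extraction step of this section can be illustrated with a minimal sketch. The function names, the choice of piecewise-linear interpolation, and the five-point profile below are assumptions introduced for illustration only; they are not values from MSISE-90 and not a prescribed implementation.

```python
from bisect import bisect_right


def extract_shape(z, n):
    """Return (eta, g) per Equations (1) and (2) for a profile n sampled on grid z."""
    z0, zM = z[0], z[-1]
    n0, nM = n[0], n[-1]
    if nM == n0:
        raise ValueError("Equation (2) requires N_0 != N_M")
    eta = [(zi - z0) / (zM - z0) for zi in z]
    g = [(ni - n0) / (nM - n0) for ni in n]
    return eta, g


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, for xs[0] <= x <= xs[-1]."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the grid; extrapolation is not attempted")
    k = min(bisect_right(xs, x), len(xs) - 1)  # index of the right-hand neighbour
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + t * (ys[k] - ys[k - 1])


def profile_from_shape(z, z0, zM, n0, nM, eta, g):
    """Evaluate Equation (2.1): N(z) = N_0 + g(eta(z)) [N_M - N_0]."""
    eta_z = (z - z0) / (zM - z0)
    return n0 + interp_linear(eta_z, eta, g) * (nM - n0)


# Hypothetical log-density values on a 5-point altitude grid (km); illustrative only.
z_grid = [100.0, 150.0, 200.0, 300.0, 400.0]
n_grid = [27.0, 24.0, 22.5, 20.5, 19.0]
eta, g = extract_shape(z_grid, n_grid)

# Value between grid points, recovered through the shape function.
n_250 = profile_from_shape(250.0, z_grid[0], z_grid[-1], n_grid[0], n_grid[-1], eta, g)
```

Linear interpolation is used here only to keep the sketch short; any interpolant defined on the grid could be substituted.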

Section 2

The terminology of Discrete Inverse Theory (DIT) is adopted for the discussion herein. By convention, the subscripts “b”, “d”, and “f” signify the “basis function”, “the ground truth function underlying the data”, and the “forward model”, respectively. First, select a basis function, with a shape function g_(b) that is expected to provide an acceptable representation of, or fit to, the shape function g_(d) of the ground truth function N_(d) that underlies the data. Using the shape extraction method as in Section 1, studies of numerical simulations, of direct observations or of analytic models, would provide information on the specific basis shape function(s) that would be appropriate, or a user can assume that a specific sample function is adequate, and then test the assertion by application to actual data or numerical simulations.

In practical situations, the basis function consists of discrete values N_(b)(z_(bi)), defined at M′+1 points Z_(b)≡{z_(bi); i=0, 1, . . . , M′}. Equations (1) and (2) may then be used to provide the basis shape function values {g_(b)(η_(b)(z_(bi)))}, where η_(b)(z_(b0))=0 and η_(b)(z_(bM′))=1. Obtaining values g_(b)(η_(b)(z)) for z∉Z_(b) with z_(b0)<z<z_(bM′) requires interpolation within the set Z_(b). For example, quadratic or spline interpolation may be used. On the other hand, extrapolation outside of the basis function domain is arbitrary, and therefore may not be optimal. The use of constraints on forward model parameters (for example, z_(fB) and z_(fT)) during inversion calculations may prevent extrapolation.

Given the basis shape function, Equations (1) and (2) may also be used to define a forward model N_(f) for N_(d)(z_(di)), the ground truth function of the system property of interest which underlies the data. In order to fit direct observations of N(z), the forward model is evaluated at the data grid points {z_(di); i=1, 2, . . . , M}. For indirect observations, a user may select points at which the forward model is to be evaluated. The model parameter vector, to be evaluated from the data by DIT, is m≡[z_(fB), z_(fT), N_(fB), N_(fT)], where “B” and “T” denote “bottom” and “top”, so that z_(fB)<z_(d1)<z_(dM)<z_(fT), N_(fB)≡N_(f)(z_(fB)), and N_(fT)≡N_(f)(z_(fT)). The forward model similarity variable corresponding to the point z_(di) is shown in Equation (3): $\begin{matrix} {\eta_{f}\left(z_{di}\right) \equiv \frac{z_{di} - z_{fB}}{z_{fT} - z_{fB}}} & (3) \end{matrix}$

and the forward model value for the retrieved property at that location is

N_(f)(z_(di))≡N_(fB)+g_(f)(η_(f)(z_(di)))[N_(fT)−N_(fB)]  (4)

Defining g_(f[i])≡g_(f)(η_(f)(z_(di))) and g_(b[i])≡g_(b)(η_(f)(z_(di))), where the square brackets distinguish the data point index from the subscripts “f” and “b” and from the basis grid indices, one may complete the forward model by identifying the vector of shape function values g_(f)≡[g_(f[1]), g_(f[2]), . . . , g_(f[i]), . . . , g_(f[M])] with the corresponding values of the basis shape function g_(b)≡[g_(b[1]), g_(b[2]), . . . , g_(b[i]), . . . , g_(b[M])], i.e.,

g_(f)≡g_(b)  (5)

This ensures that all forward model functions will be similar in shape to the basis function. During the fitting process, η_(f)(z_(di)) may change from iteration to iteration for each “i”, so that the vector g_(f) may also change. On the other hand, the underlying shape function g_(f)(η)≡g_(b)(η) does not change. It should be noted that, in the terminology for generalization of the present method for functions of a single variable, Equations (3)-(5) define an “L=1” forward model, where L is the number of contiguous segments of “g” that are being mapped to the data or to an exact profile.

Equations (3) and (4) show that manipulation of “m” permits one to shift the basis function to higher or lower values of N_(f) and to stretch or compress the basis function in order to fit the ground truth function N_(d)(z_(di)). It should be noted that compression may degrade accuracy by forcing the fitting code to extrapolate the basis shape function beyond the domain on which g_(b) is defined, i.e., to η_(f) outside the interval [0,1]. The transformation represented by varying “m” is generally referred to as a “similarity transformation”, since the transformation maintains the non-dimensional shape characteristics embedded in g_(b)(η).
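As a hedged illustration of Equations (3)-(5), the following sketch evaluates an L=1 forward model for two parameter vectors that share the same altitude mapping. The basis shape function (g_b = η²), the grids, the parameter values, and the helper names are hypothetical choices made for the example.

```python
from bisect import bisect_right


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, for xs[0] <= x <= xs[-1]."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("eta_f outside [0, 1]: the basis shape would be extrapolated")
    k = min(bisect_right(xs, x), len(xs) - 1)
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + t * (ys[k] - ys[k - 1])


def forward_model(z_d, m, eta_b, g_b):
    """Return N_f on the data grid z_d for m = [z_fB, z_fT, N_fB, N_fT]."""
    z_fB, z_fT, n_fB, n_fT = m
    values = []
    for z in z_d:
        eta_f = (z - z_fB) / (z_fT - z_fB)         # Equation (3)
        g_f = interp_linear(eta_f, eta_b, g_b)     # Equation (5): g_f = g_b
        values.append(n_fB + g_f * (n_fT - n_fB))  # Equation (4)
    return values


# Hypothetical basis shape function sampled on six points (g_b = eta_b**2).
eta_b = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
g_b = [e * e for e in eta_b]
z_d = [120.0, 200.0, 300.0, 450.0]

m_a = [100.0, 500.0, 30.0, 20.0]
m_b = [100.0, 500.0, 32.0, 18.0]  # same altitude mapping, shifted and rescaled values
n_a = forward_model(z_d, m_a, eta_b, g_b)
n_b = forward_model(z_d, m_b, eta_b, g_b)

# Non-dimensionalizing either profile recovers the same shape values.
shape_a = [(n - m_a[2]) / (m_a[3] - m_a[2]) for n in n_a]
shape_b = [(n - m_b[2]) / (m_b[3] - m_b[2]) for n in n_b]
```

The two profiles differ in value but reduce to the same dimensionless shape, which is the sense in which varying m acts as a similarity transformation.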

Consider the following example to demonstrate the retrieval process and to permit potential users to test the method of the present invention. The simulated data consists of MSISE-90 profiles of neutral species number densities ([N₂]_(d), [O]_(d), [O₂]_(d)), evaluated at altitudes z_(di) in the interval [120, 450] km. The specific thermospheric conditions correspond to latitude 67.5° and longitude 220° during a major geomagnetic storm: year 1982, day 195, local time 0900 hr, Ap=153, F_(10.7)=260, and 81-day average (F_(10.7))=168. Note that 3-hr ap inputs were used in MSISE-90 and that the Ap value is given only for perspective on this pathologically active day. Fitting the natural logarithm of the data, {N_(sd)(z_(di))≡log([x_(s)]_(d)(z_(di))); s=1, 2, 3; x₁=N₂, x₂=O₂, x₃=O; i=1, 2, . . . , M}, provided the best results. For this exemplary illustration, the synthetic data does not include noise. Consequently, the covariance of the data, [cov d], was set to the identity matrix, so that χ² is the sum over species of the squared residuals at the data grid points (unweighted nonlinear least squares fit). Because the shape function g_(sf)(η) does not vary from iteration to iteration, convergence has always occurred using the present method.
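Written out, the unweighted objective described above may be expressed as follows. The symbol N_(sf)(z_(di); m) for the forward model value of species s at data grid point z_(di), and the residual vector r(m), are notations assumed here for illustration: $\begin{matrix} {\chi^{2}(m) = \sum_{s = 1}^{3}\sum_{i = 1}^{M}\left\lbrack N_{sd}\left(z_{di}\right) - N_{sf}\left(z_{di};m\right) \right\rbrack^{2}} \end{matrix}$

This is the special case, with [cov d] equal to the identity matrix, of the customary weighted form $\begin{matrix} {\chi^{2}(m) = r(m)^{T}\,\lbrack{cov}\ d\rbrack^{- 1}\,r(m)} \end{matrix}$, where r(m) collects the residuals N_(sd)(z_(di))−N_(sf)(z_(di); m) over all species and data grid points.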

Basis and Forward Model Shape Functions

In order to calculate the basis profile, consider similar thermospheric conditions, but specify the latitude and longitude to be −2.5° and 140°, respectively. Select a basis grid with z_(bi) in the interval [z_(b0), z_(bM′)]=[102, 923] km, where the number of points is M′+1=26. FIG. 1a shows the basis values for atomic oxygen under these conditions. Given the log values of the basis density for each species “s” at the basis grid points, i.e., {N_(sb)(z_(bj))≡log ([x_(s)]_(b)(z_(bj))); s=1, 2, 3; j=0, 1, . . . , M′}, use Equations (1) and (2) to generate separate shape functions {g_(sb)(η_(b)(z_(bj))); s, j ranging} for the species.

Then, at each iteration of the fitting process, for each species, and at each “data” grid point z_(di), evaluate the similarity variable η_(sf)(z_(di)) using Equation (3), and then interpolate the species basis shape function to the data grid points z_(di) to obtain the values g_(sf)=g_(sb), as in Equation (5). Equation (4) may then be used to compute separate forward model values for each species on the “data” grid {z_(di)}.
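The per-iteration evaluation just described can be exercised with a small, noise-free, single-species test. The sketch below is not the nonlinear least squares algorithm used for the figures; as a simplifying assumption it searches a coarse grid of candidate (z_fB, z_fT) pairs and, for each pair, solves for (N_fB, N_fT) in closed form, which is possible because Equation (4) is linear in those two parameters. The basis shape, grids, candidate lists, and function names are all hypothetical.

```python
from bisect import bisect_right
from itertools import product


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, for xs[0] <= x <= xs[-1]."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("eta_f outside [0, 1]: the basis shape would be extrapolated")
    k = min(bisect_right(xs, x), len(xs) - 1)
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + t * (ys[k] - ys[k - 1])


def shape_values(z_d, z_fB, z_fT, eta_b, g_b):
    """Equations (3) and (5): basis shape function interpolated to the data grid."""
    return [interp_linear((z - z_fB) / (z_fT - z_fB), eta_b, g_b) for z in z_d]


def best_endpoint_values(g, d):
    """Closed-form least squares for (N_fB, N_fT) with the shape values g held fixed."""
    u = [1.0 - gi for gi in g]  # N_f = N_fB * (1 - g) + N_fT * g
    suu = sum(ui * ui for ui in u)
    svv = sum(gi * gi for gi in g)
    suv = sum(ui * gi for ui, gi in zip(u, g))
    sud = sum(ui * di for ui, di in zip(u, d))
    svd = sum(gi * di for gi, di in zip(g, d))
    det = suu * svv - suv * suv  # nonzero when g takes at least two distinct values
    return (svv * sud - suv * svd) / det, (suu * svd - suv * sud) / det


def fit(z_d, d, eta_b, g_b, z_fB_candidates, z_fT_candidates):
    """Return (chi2, m) minimising the unweighted sum of squared residuals."""
    best = None
    for z_fB, z_fT in product(z_fB_candidates, z_fT_candidates):
        g = shape_values(z_d, z_fB, z_fT, eta_b, g_b)
        n_fB, n_fT = best_endpoint_values(g, d)
        chi2 = sum((di - (n_fB + gi * (n_fT - n_fB))) ** 2 for gi, di in zip(g, d))
        if best is None or chi2 < best[0]:
            best = (chi2, [z_fB, z_fT, n_fB, n_fT])
    return best


# Hypothetical noise-free case: synthesise "data" from a known parameter vector.
eta_b = [j / 25 for j in range(26)]             # 26 basis points
g_b = [e * e * (3.0 - 2.0 * e) for e in eta_b]  # assumed smooth, monotone shape
z_d = [120.0 + 30.0 * i for i in range(12)]     # 120, 150, ..., 450 km
m_true = [100.0, 500.0, 30.0, 20.0]
g_true = shape_values(z_d, m_true[0], m_true[1], eta_b, g_b)
d = [m_true[2] + gi * (m_true[3] - m_true[2]) for gi in g_true]

# Every candidate keeps the data grid inside the forward model altitude domain.
chi2, m_fit = fit(z_d, d, eta_b, g_b,
                  z_fB_candidates=[80.0, 90.0, 100.0, 110.0],
                  z_fT_candidates=[460.0, 480.0, 500.0, 520.0])
```

Because the true boundary altitudes are among the candidates and the data carry no noise, the search is expected to return the generating parameters; with real data a standard nonlinear least squares routine would replace the grid search.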

Initialization of the Model Parameter Vector

The model parameter vector is m=[z_(fB)(N₂), z_(fT)(N₂), z_(fB)(O₂), z_(fT)(O₂), z_(fB)(O), z_(fT)(O), log [N₂]_(fB), log [N₂]_(fT), log [O₂]_(fB), log [O₂]_(fT), log [O]_(fB), log [O]_(fT)]. Denote initial model parameter values by superscript “0” and choose the component values of m⁰ to be identical with the basis values, i.e., z_(fB)⁰(N₂)=z_(b0), z_(fT)⁰(N₂)=z_(bM′), etc. For every species x_(s), this ensures that z_(fB)⁰(x_(s))<z_(d1) and z_(fT)⁰(x_(s))>z_(dM) and that η_(sf)⁰(z_(di))∈[0,1] for every data grid point z_(di). Thus the data grid falls entirely within the forward model altitude domain, a situation which should be maintained during the fitting or inversion process. Failure to do so for a given species x_(s) would result in up to two non-null subsequences, α_(L) and α_(U), of the data grid indices, such that the corresponding subset of η_(sf)-coordinate values would be outside of the unit interval, i.e., {η_(f)(z_(di)); i∈α_(L)∪α_(U)}⊄[0,1], causing extrapolation of the shape function. On these subsets, the information embedded in g_(b) would not be entirely useful, and the overall inversion results would be unpredictable.
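The containment condition above lends itself to a simple check before each forward model evaluation. The following sketch is illustrative only: the function name is hypothetical, the data grid is a four-point stand-in, and reading α_(L) and α_(U) as the indices falling below and above the forward model domain is an assumption about the notation.

```python
def extrapolated_indices(z_d, z_fB, z_fT):
    """Return (alpha_L, alpha_U): data-grid indices with eta_f < 0 and eta_f > 1."""
    alpha_L, alpha_U = [], []
    for i, z in enumerate(z_d, start=1):  # data indices run from 1 to M, as in the text
        eta_f = (z - z_fB) / (z_fT - z_fB)
        if eta_f < 0.0:
            alpha_L.append(i)
        elif eta_f > 1.0:
            alpha_U.append(i)
    return alpha_L, alpha_U


z_d = [120.0, 200.0, 300.0, 450.0]             # stand-in data grid (km)
ok = extrapolated_indices(z_d, 102.0, 923.0)   # initial guess equal to the basis domain
bad = extrapolated_indices(z_d, 150.0, 400.0)  # a compressed forward model domain
```

A fitting routine could reject, or penalise, any trial parameter vector for which either list is non-empty.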

Section 3

In this section, a piecewise fitting of the basis shape function is adopted for contiguous subsets of the data vector or of an exact profile, thus permitting the definition of separate, but connected, forward models for the contiguous subsets, with each forward model stretching or shifting a portion of the basis shape function in order to achieve an optimal fit to the respective subset. FIG. 4(b) shows the mapping of one segment of the basis shape function, excluding the endpoint at η_(bk), onto the semiopen subinterval [z_(k−1), z_(k)) within the overall domain [z_(B), z_(T)] of the forward model. The adjacent segments of the shape function, shown by dashed lines, similarly map onto corresponding (dashed lines) subintervals of the forward model domain. The fitting calculation determines the optimal locations of the subinterval boundary points.

In a third embodiment, the basis function g_(b) may be split into an ordered set of L elements, consisting of contiguous sections that may be held “fixed” during the fitting process. This approach involves partitioning the domain of g_(b) (i.e., η_(b)∈[0, 1]) into an ordered set of L contiguous subintervals. Likewise, partition the forward model altitude interval [z_(fB), z_(fT)] (or equivalently, the η_(f) interval [0, 1]) into the same number (L) of subintervals, whose boundary locations {η_(fk)} and function values {N(η_(fk))} serve as additional model parameters to be optimized by the fitting process. For data points falling in a given subinterval of the forward model domain, apply the method as in Section 1, using the corresponding segment of g_(b) to define the forward model, while noting that g_(b)=g_(f), as in Equations (3) and (4). This requires remapping of g_(b), η_(b), and η_(f) in each of the respective subintervals to the unit interval [0, 1]. FIG. 4(c) depicts the remapped segment of g_(b), denoted γ, as a function of the remapped similarity variable ζ. Further elaboration of the above techniques is found in Appendix I, attached hereto, the contents of which are incorporated herein by reference, as disclosing an article by Picone et al. entitled Similarity Transformations for Fitting of Geophysical Properties: Application to Altitude Profiles of Upper Atmospheric Species.
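One possible reading of the segment remapping is sketched below. The specific formulas for ζ and γ are assumptions made for illustration (each segment of g_(b) is rescaled linearly so that both ζ and γ run from 0 to 1); the detailed formulation is given in Appendix I and may differ. All names, grids, and values are hypothetical.

```python
from bisect import bisect_right


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, for xs[0] <= x <= xs[-1]."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the grid; extrapolation is not attempted")
    k = min(bisect_right(xs, x), len(xs) - 1)
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + t * (ys[k] - ys[k - 1])


def piecewise_forward_model(z, z_k, n_k, eta_bk, eta_b, g_b):
    """Evaluate an L-segment forward model at altitude z.

    z_k, n_k : forward model boundary altitudes and values (L + 1 entries each)
    eta_bk   : basis-domain breakpoints, with eta_bk[0] = 0 and eta_bk[-1] = 1
    Assumes g_b differs between the two ends of every basis segment.
    """
    if not z_k[0] <= z <= z_k[-1]:
        raise ValueError("z lies outside the forward model domain")
    L = len(z_k) - 1
    k = min(bisect_right(z_k, z), L)  # segment k covers [z_k[k-1], z_k[k])
    zeta = (z - z_k[k - 1]) / (z_k[k] - z_k[k - 1])  # remapped similarity variable
    eta = eta_bk[k - 1] + zeta * (eta_bk[k] - eta_bk[k - 1])
    eta = min(max(eta, eta_bk[k - 1]), eta_bk[k])    # guard against rounding
    g_lo = interp_linear(eta_bk[k - 1], eta_b, g_b)
    g_hi = interp_linear(eta_bk[k], eta_b, g_b)
    gamma = (interp_linear(eta, eta_b, g_b) - g_lo) / (g_hi - g_lo)  # remapped segment
    return n_k[k - 1] + gamma * (n_k[k] - n_k[k - 1])


# Hypothetical basis shape (g_b = eta_b**2) split into L = 2 segments at eta_b = 0.4.
eta_b = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
g_b = [e * e for e in eta_b]
eta_bk = [0.0, 0.4, 1.0]

# Boundary values chosen so that the L = 2 model coincides with an L = 1 model
# having m = [100, 500, 30, 20] when the interior boundary sits at 260 km.
n_mid = 30.0 + interp_linear(0.4, eta_b, g_b) * (20.0 - 30.0)
n_k = [30.0, n_mid, 20.0]
n_same = piecewise_forward_model(300.0, [100.0, 260.0, 500.0], n_k, eta_bk, eta_b, g_b)

# Moving the interior boundary stretches one segment and compresses the other.
n_moved = piecewise_forward_model(300.0, [100.0, 300.0, 500.0], n_k, eta_bk, eta_b, g_b)
```

In this reading, the interior boundary altitude and its function value are the extra parameters that a fitting routine would adjust, while each segment of the basis shape stays fixed.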

In a fourth embodiment, partition the basis function domain and the forward model domain into ordered sets of L contiguous subintervals, remap the subinterval g and η functions, as in the third embodiment. However, the boundary locations of the forward model domain subintervals are held fixed while treating the basis segment endpoint locations η_(bk) as model parameters to be varied. The values of the forward model {N(η_(fk))} remain as model parameters.

In a fifth embodiment, the approaches of the third and fourth embodiments may be combined by treating both the segment endpoint locations η_(bk) of the basis function domain and the corresponding segment endpoint locations η_(fk) of the forward model domain as model parameters, along with {N(η_(fk))}.

In yet another embodiment, the method of the present invention is extended to represent multivariate functions. Assume that a physical system or function has true or noiseless values {N(x_(i), y_(j), z_(k)); i=0, 1, . . . , I; j=0, 1, . . . J; k=0, 1, . . . K} on a monotonically increasing discrete grid, so that x₀<x₁< . . . <x_(I); y₀<y₁< . . . <y_(J); z₀<z₁< . . . <z_(K). Define dimensionless similarity variables as shown in Equation 6: $\begin{matrix} {{\eta_{x}(x) \equiv \frac{x - x_{0}}{x_{I} - x_{0}}},\quad {\eta_{y}(y) \equiv \frac{y - y_{0}}{y_{J} - y_{0}}},\ldots,} & (6) \end{matrix}$

and a dimensionless shape function as shown in Equation 7: $\begin{matrix} {g\left(\eta_{x}(x),\eta_{y}(y),\ldots\right) \equiv \frac{N(x,y,\ldots,z) - N_{0}}{N_{IJ\ldots K} - N_{0}},} & (7) \end{matrix}$

where N₀=N(x₀, y₀, . . . , z₀) and N_(IJ . . . K)=N(x_(I), y_(J), . . . , z_(K)).
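For two independent variables, the extraction in Equations (6) and (7) can be sketched as follows. The function name and the 3×3 grid of values are hypothetical and serve only to show the arithmetic.

```python
def extract_shape_2d(x, y, n):
    """Return (eta_x, eta_y, g) for values n[i][j] sampled on the grid x[i], y[j]."""
    n0, n_top = n[0][0], n[-1][-1]  # N_0 and N_IJ, the values at opposite corners
    if n_top == n0:
        raise ValueError("the two corner values must differ")
    eta_x = [(xi - x[0]) / (x[-1] - x[0]) for xi in x]
    eta_y = [(yj - y[0]) / (y[-1] - y[0]) for yj in y]
    g = [[(nij - n0) / (n_top - n0) for nij in row] for row in n]
    return eta_x, eta_y, g


# Hypothetical values on a 3 x 3 grid; illustrative only.
x = [0.0, 10.0, 20.0]
y = [100.0, 200.0, 400.0]
n = [[27.0, 24.0, 20.0],
     [26.0, 23.5, 19.5],
     [25.0, 23.0, 19.0]]
eta_x, eta_y, g = extract_shape_2d(x, y, n)
```

The shape function equals 0 and 1 at the two reference corners by construction; unlike the monotone single-variable example, its values elsewhere on the grid need not lie between 0 and 1.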

A multivariate forward model is then created. Given the basis shape function, in order to fit direct observations of N(x, y, . . . , z), the forward model is evaluated at the data grid points {R_(d)(i, j, . . . k)≡(x_(di), y_(dj), . . . , z_(dk)); i=1, 2, . . . , I_(d); j=1, 2, . . . , J_(d); . . . ; k=1, 2, . . . , K_(d)}. For indirect observations, a user selects points at which the forward model is to be evaluated. The model parameter vector, to be evaluated from the data by DIT, is m≡[x_(fB), x_(fT), y_(fB), y_(fT), . . . , z_(fB), z_(fT), N_(fB), N_(fT)], where “B” and “T” denote “bottom” and “top”, so that z_(fB)<z_(d1)<z_(dK)<z_(fT), N_(fB)≡N_(f)(z_(fB)), and N_(fT)≡N_(f)(z_(fT)). The forward model similarity variables corresponding to the point R_(d)(i, j, . . . k) are given by the following equation: $\begin{matrix} {{\eta_{fx}\left(x_{di}\right) \equiv \frac{x_{di} - x_{fB}}{x_{fT} - x_{fB}}},\quad {\eta_{fy}\left(y_{dj}\right) \equiv \frac{y_{dj} - y_{fB}}{y_{fT} - y_{fB}}},\ldots,} & (8) \end{matrix}$

and the forward model value for the retrieved property at that location is given by the following equation:

N_(f)(R_(d))≡N_(fB)+g_(f)(η_(fx)(x_(di)), η_(fy)(y_(dj)), . . . )[N_(fT)−N_(fB)]  (9)

Define g_(f[d])≡g_(f)(η_(fx)(x_(di)), η_(fy)(y_(dj)), . . . ) and g_(b[d])≡g_(b)(η_(fx)(x_(di)), η_(fy)(y_(dj)), . . . ), where d≡(i, j, . . . , k) is an n-tuple of integers labeling a vector of indices and runs from 1 to D≡(I_(d), J_(d), . . . , K_(d)), and where the square brackets distinguish the data point index from the subscripts “f” and “b” and from the basis grid indices. One may then complete the forward model by identifying the vector of shape function values g_(f)≡[g_(f[1]), g_(f[2]), . . . g_(f[d]), . . . g_(f[D])] with the corresponding values of the basis shape function g_(b)≡[g_(b[1]), g_(b[2]), . . . g_(b[d]), . . . g_(b[D])]. Equation (5) also holds true in this case.

This ensures that all forward model functions will be similar in shape to the basis function. During the fitting process, η_(fx)(x_(di)), η_(fy)(y_(dj)), . . . will change from iteration to iteration for each d, so that the vector g_(f) will also change. On the other hand, the underlying shape function g_(f)(η_(x), η_(y), . . . )≡g_(b)(η_(x), η_(y), . . . ) does not change.

In a further embodiment, the method of the present invention also applies to system functions which require different basis shape functions (i.e., basis shape functions with different numbers and configurations of saddle points, maxima, and minima) in different sub-domains. Here, fitting or inversion would apply different forward models to different sub-domains.

The present method also facilitates capturing the shape of an N-dimensional hypersurface by generalizing Equations (1) and (2) to the case of N similarity variables. Manipulating the model parameters in the N-dimensional generalization of Equation (3) produces a class of similar surfaces. This approach may be useful in visualization and graphics applications, in studies of functional similarity, and in comparisons of coincident data or model results.

The present invention is a novel method of incorporating knowledge of generic system function shape properties into fitting functions or forward models for data inversion. Further, the present invention also represents a unique way of manipulating a class of system functions for use in data visualization, display, or synthesis. For given data sets or detailed numerical simulations describing an engineering or physical system, the present invention provides (i) a new method of determining generic system function shape properties, and (ii) a new, efficient method of embedding generic properties into a fitting function or a parameterized representation of a class of functions for data and system visualization. Also, for any system variables on which data or numerical simulations are available, the present invention provides a fitting function or forward model with optimal robustness (flexibility) for smoothing or inverting noisy data, that is, with neither too much nor too little flexibility and neither too many nor too few model parameters to be determined from inversion or fitting calculations in the presence of noise. The present invention achieves optimal computational complexity due to the optimal robustness.

Although various embodiments of the present invention are discussed with applications to geophysical functions, such discussion should be considered only as exemplary. It will be understood that the present invention applies equally to the parametric representation of any real-world function, including engineering functions or physical functions.

It is believed that the operation and construction of the present invention will be apparent from the foregoing Detailed Description. While the apparatus and method shown and described have been characterized as being preferred, it should be readily understood that various changes, modifications and enhancements could be made therein without departing from the scope of the present invention as set forth in the following claims.

Accordingly, those skilled in the art should readily appreciate that these and other variations, additions, modifications, enhancements, et cetera, are deemed to be within the ambit of the present invention whose scope is determined solely by the following claims as properly interpreted in accordance with the doctrine of equivalents. 

What is claimed is:
 1. A method for extracting generic shape information of a function, said method comprising: a) obtaining samples of the function; b) defining a dimensionless similarity variable and a dimensionless shape function; c) extracting discrete values of the shape function from the samples; d) performing function manipulation and retrieval from the extracted discrete values; e) selecting a basis function; f) mapping the basis function profile to the forward model to ensure that the forward model function is similar in shape to the basis function; g) defining a forward model N_(f) for a ground truth function, the forward model being adjustable to represent an exact profile of the property of interest that underlies data; h) performing a fitting process of the forward model to sample data while maintaining the underlying shape function constant to obtain a fitted forward model; and i) retrieving the profile parameter values from the fitted forward model.
 2. The method of claim 1, wherein the dimensionless similarity variable is defined according to the equation $\eta(z) \equiv \frac{z - z_{0}}{z_{M} - z_{0}}.$


 3. The method of claim 1, wherein the dimensionless shape function is defined according to the equation $\begin{matrix} {g(\eta(z)) \equiv \frac{N(z) - N_{0}}{N_{M} - N_{0}} = \frac{N([z_{M} - z_{0}]\,\eta(z) + z_{0}) - N_{0}}{N_{M} - N_{0}},} & (2) \end{matrix}$

where, for example, N(z)=log [O](z), “log” denotes the natural logarithm, and “z” varies from the lowest to the highest grid point defining the altitude profile.
 4. The method of claim 3, wherein the discrete values of the shape function are extracted such that g_(i)=g(η_(i))=g(η(z_(i))).
 5. The method of claim 1, wherein the basis function of step (e) comprises discrete values N_(b)(z_(bi)), defined at M′+1 points such that Z_(b)≡{z_(bi), i=0, 1, . . . , M′}.
 6. The method of claim 5, wherein said basis function is calculated using at least one of quadratic or spline interpolation techniques.
 7. The method of claim 1, wherein step (g) further comprises: i) evaluating, for direct observations, the forward model at data grid points {z_(di); i=1, 2, . . . , M}; ii) evaluating, for indirect observations, the forward model by selecting data points at which the forward model is to be evaluated; iii) evaluating a model parameter vector m≡[z_(fB), z_(fT), N_(fB), N_(fT)], where “B” and “T” denote “bottom” and “top”, such that z_(fB)<z_(d1)<z_(dM)<z_(fT), N_(fB)≡N_(f)(z_(fB)), and N_(fT)≡N_(f)(z_(fT)).
 8. The method of claim 7, further comprising: iv) shifting the basis function to higher or lower values of N_(f) by manipulating the model parameter vector (m).
 9. The method of claim 1, wherein the ensuring step (f) is performed by defining g_(f[i])≡g_(f)(η_(f)(z_(di))) and g_(b[i])≡g_(b)(η_(f)(z_(di))).
 10. The method of claim 1, wherein the shape information is extracted according to $\begin{matrix} {\eta(z) \equiv \frac{z - z_{0}}{z_{M} - z_{0}} \quad \text{and} \quad g(\eta(z)) \equiv \frac{N(z) - N_{0}}{N_{M} - N_{0}} = \frac{N([z_{M} - z_{0}]\,\eta(z) + z_{0}) - N_{0}}{N_{M} - N_{0}},} & (2) \end{matrix}$

where, for example, N(z)=log [O](z), “log” denotes the natural logarithm, and “z” varies from the lowest to the highest grid point defining the altitude profile.
 11. The method of claim 1, wherein the measured profiles as in step (a) are obtained using remote sensing devices.
 12. The method of claim 10, wherein samples of atmospheric measurements as in step (a) are obtained using one or more of (i) remote measurements, (ii) numerical simulations, and (iii) analytic models.